Robust Subsampling
Abstract
We characterize the robustness of subsampling procedures by deriving a formula for the breakdown point of subsampling quantiles. This breakdown point can be very low for moderate subsampling block sizes, which implies that subsampling procedures are fragile even when they are applied to robust statistics. The instability also arises for data-driven block size selection procedures that minimize the minimum confidence interval volatility index, but it can be mitigated if a more robust calibration method is applied instead. To overcome these robustness problems, we introduce a consistent robust subsampling procedure for M-estimators and derive explicit subsampling quantile breakdown point characterizations for MM-estimators in the linear regression model. Monte Carlo simulations in two settings where the bootstrap fails show the accuracy and robustness of robust subsampling relative to conventional subsampling.
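As a rough illustration of the fragility described in the abstract, the sketch below computes a conventional subsampling quantile and compares it on clean and contaminated data. It is not the paper's robust procedure and does not implement its breakdown point formula. The function name `subsampling_quantile`, the use of overlapping blocks, the `sqrt(m)` scaling, the median as the statistic, and the contamination pattern are all assumptions chosen for the example.

```python
import numpy as np

def subsampling_quantile(x, block_size, level, statistic=np.median):
    """Empirical `level`-quantile of sqrt(m) * (T_block - T_full),
    taken over all overlapping blocks of length m = block_size.
    Hypothetical helper: the scaling and block scheme are assumptions."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 <= block_size <= n:
        raise ValueError("block_size must be between 1 and len(x)")
    full = statistic(x)
    # One row per overlapping block x[i:i+m], for i = 0, ..., n - m.
    blocks = np.lib.stride_tricks.sliding_window_view(x, block_size)
    block_stats = np.apply_along_axis(statistic, 1, blocks)
    roots = np.sqrt(block_size) * (block_stats - full)
    return float(np.quantile(roots, level))

rng = np.random.default_rng(0)
clean = rng.standard_normal(200)

# Assumed contamination: 5% gross outliers placed in one contiguous run.
contaminated = clean.copy()
contaminated[:10] = 1e6

q_clean = subsampling_quantile(clean, block_size=10, level=0.99)
q_bad = subsampling_quantile(contaminated, block_size=10, level=0.99)
print(q_clean, q_bad)
```

In this setup the full-sample median barely moves under 5% contamination, yet a handful of blocks consist mostly of outliers, and that is enough to drag the upper subsampling quantile to an extreme value. Smaller blocks make it easier for outliers to dominate a block, which is one way to read the abstract's remark about moderate block sizes.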
Similar resources
Robust Watermarking based on Subsampling and Nonnegative Matrix Factorization
This paper presents a novel robust digital image watermarking scheme using subsampling and nonnegative matrix factorization. Firstly, subsampling is used to construct a subimage sequence. Then, based on the column similarity of the subimage sequence, nonnegative matrix factorization (NMF) is applied to decompose the sequence. A Gaussian pseudo-random watermark sequence is embedded in the factor...
A Subsampling Method for the Computation of Multivariate Estimators with High Breakdown Point
All known robust location and scale estimators with high breakdown point for multivariate samples are very expensive to compute. In practice, this computation has to be carried out using an approximate subsampling procedure. In this work we describe an alternative subsampling scheme, applicable to both the Stahel-Donoho estimator and the estimator based on the Minimum Volume Ellipsoid, wit...
Efficient Subsampling for Training Complex Language Models
We propose an efficient way to train maximum entropy language models (MELM) and neural network language models (NNLM). The advantage of the proposed method comes from a more robust and efficient subsampling technique. The original multi-class language modeling problem is transformed into a set of binary problems where each binary classifier predicts whether or not a particular word will occur. ...
Subsampling tests of parameter hypotheses and overidentifying restrictions with possible failure of identification
We introduce a general testing procedure in models with possible identification failure that has exact asymptotic rejection probability under the null hypothesis. The procedure is widely applicable and in this paper we apply it to tests of arbitrary linear parameter hypotheses as well as to tests of overidentification in time series models given by unconditional moment conditions. The main idea...
The DetS and DetMM estimators for multivariate location and scatter
New deterministic robust estimators of multivariate location and scatter are presented. They combine ideas from the deterministic DetMCD estimator with steps from the subsampling-based FastS and FastMM algorithms. The new DetS and DetMM estimators perform similarly to FastS and FastMM on low-dimensional data, whereas in high dimensions they are more robust. Their computation time is much lower ...
Publication date: 2008